Self-Paced Multitask Learning with Shared Knowledge
This paper introduces self-paced task selection to multitask learning: instances
from more closely related tasks are selected in an easier-to-harder progression,
emulating an effective human education strategy in a multitask machine learning
setting. We develop the mathematical foundation
for the approach based on iterative selection of the most appropriate task,
learning the task parameters, and updating the shared knowledge, optimizing a
new bi-convex loss function. The proposed method applies quite generally,
including to multitask feature learning and multitask learning with alternating
structure optimization. Results show that in each of these formulations,
self-paced (easier-to-harder) task selection outperforms the baseline version of
the method in all experiments.
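The easier-to-harder ordering at the heart of the approach can be sketched in a few lines. This is a minimal illustration that uses each task's current loss as a difficulty proxy; the paper's actual criterion iteratively selects the most appropriate task while updating shared knowledge, which this stand-in does not reproduce.

```python
def self_paced_task_order(task_losses):
    """Order tasks easiest-first, using current per-task loss as a simple
    difficulty proxy (an illustrative stand-in for the paper's iterative
    selection criterion)."""
    return sorted(task_losses, key=task_losses.get)

# Tasks with lower loss are treated as "easier" and scheduled first.
order = self_paced_task_order({"taskA": 0.9, "taskB": 0.2, "taskC": 0.5})
```

In the paper's scheme the ordering would be re-derived after every round of task-parameter and shared-knowledge updates rather than fixed up front.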
Cluster-Based Term Weighting and Document Ranking Models
A term weighting scheme measures the importance of a term in a collection. A document ranking model uses these term weights to find the rank or score of a document in a collection. We present a series of cluster-based term weighting and document ranking models based on the TF-IDF and Okapi BM25 models. These term weighting and document ranking models update the inter-cluster and intra-cluster frequency components based on the generated clusters. These inter-cluster and intra-cluster frequency components are used for weighting the importance of a term in addition to the term and document frequency components. In this thesis, we show how these models outperform the TF-IDF and Okapi BM25 models in document clustering and ranking.
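The idea of augmenting TF-IDF with a cluster-level frequency component can be sketched as follows. This is a hedged illustration: the intra-cluster factor here (the fraction of documents in the query document's cluster containing the term) is one plausible instantiation, not necessarily the thesis's exact formula.

```python
import math

def cluster_tfidf(term, doc, docs, clusters):
    """Illustrative cluster-augmented TF-IDF: the usual tf * idf weight is
    scaled by an intra-cluster frequency factor. Documents are token
    lists; `clusters` is a list of lists of documents."""
    tf = doc.count(term)
    idf = math.log(len(docs) / (1 + sum(1 for d in docs if term in d)))
    cluster = next(c for c in clusters if doc in c)
    intra = sum(1 for d in cluster if term in d) / len(cluster)
    return tf * idf * intra
```

A BM25 variant would scale the saturation-normalized term frequency by the same kind of cluster factor instead of the raw tf.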
Eye of the Beholder: Improved Relation Generalization for Text-based Reinforcement Learning Agents
Text-based games (TBGs) have become a popular proving ground for the
demonstration of learning-based agents that make decisions in quasi real-world
settings. The crux of the problem for a reinforcement learning agent in such
TBGs is identifying the objects in the world, and those objects' relations with
that world. While the recent use of text-based resources to increase an
agent's knowledge and improve its generalization has shown promise, we posit
in this paper that there is much yet to be learned from visual representations
of these same worlds. Specifically, we propose to retrieve images that
represent specific instances of text observations from the world and train our
agents on such images. This improves the agent's overall understanding of the
game 'scene' and of objects' relationships to the world around them, and the
variety of visual representations on offer allows the agent to generalize
relationships more effectively. We show that incorporating such images improves
the performance of agents in various TBG settings.
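The retrieval step the abstract describes, matching images to text observations, can be sketched with a deliberately simple caption-overlap ranker. The paper's actual retriever is not specified here; this token-overlap scoring is purely an assumed stand-in.

```python
def retrieve_images(observation, image_captions):
    """Rank candidate images for a text observation by token overlap
    between the observation and each image's caption (an illustrative
    stand-in for the paper's retrieval component).
    image_captions: dict mapping image id -> caption string."""
    obs = set(observation.lower().split())

    def overlap(img):
        return len(obs & set(image_captions[img].lower().split()))

    return sorted(image_captions, key=overlap, reverse=True)
```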
Targeted Advertising on Social Networks Using Online Variational Tensor Regression
This paper is concerned with online targeted advertising on social networks.
The main technical task we address is to estimate the activation probability
for user pairs, which quantifies the influence one user may have on another
towards purchasing decisions. This is a challenging task because one marketing
episode typically involves a multitude of marketing campaigns/strategies of
different products for highly diverse customers. In this paper, we propose what
we believe is the first tensor-based contextual bandit framework for online
targeted advertising. The proposed framework is designed to accommodate any
number of feature vectors in the form of a multi-mode tensor, thereby enabling it to
capture the heterogeneity that may exist over user preferences, products, and
campaign strategies in a unified manner. To handle inter-dependency of tensor
modes, we introduce an online variational algorithm with a mean-field
approximation. We empirically confirm that the proposed TensorUCB algorithm
achieves a significant improvement in influence maximization tasks over the
benchmarks, which is attributable to its capability of capturing the
user-product heterogeneity. Comment: 18 pages, 7 figures.
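The upper-confidence-bound scoring that TensorUCB builds on can be illustrated in its familiar vector (LinUCB-style) form: predicted reward plus an uncertainty bonus. The paper generalizes this to multi-mode tensor contexts with a variational mean-field update; the sketch below is only the scalar-scoring core, with all names assumed.

```python
import numpy as np

def ucb_score(x, A_inv, theta, alpha=1.0):
    """LinUCB-style score for one context vector: estimated mean reward
    plus an exploration bonus proportional to model uncertainty.
    A_inv is the inverse design matrix, theta the current estimate."""
    mean = float(x @ theta)
    bonus = alpha * float(np.sqrt(x @ A_inv @ x))
    return mean + bonus
```

At each round the arm (here, a user-pair/campaign context) with the highest score would be selected, after which `A_inv` and `theta` are updated with the observed reward.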
Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines
Text-based games have emerged as an important test-bed for Reinforcement
Learning (RL) research, requiring RL agents to combine grounded language
understanding with sequential decision making. In this paper, we examine the
problem of infusing RL agents with commonsense knowledge. Such knowledge would
allow agents to efficiently act in the world by pruning out implausible
actions, and to perform look-ahead planning to determine how current actions
might affect future world states. We design a new text-based gaming environment
called TextWorld Commonsense (TWC) for training and evaluating RL agents with a
specific kind of commonsense knowledge about objects, their attributes, and
affordances. We also introduce several baseline RL agents which track the
sequential context and dynamically retrieve the relevant commonsense knowledge
from ConceptNet. We show that agents which incorporate commonsense knowledge in
TWC perform better while acting more efficiently. We conduct user studies to
estimate human performance on TWC and show that there is ample room for future
improvement.
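The action-pruning behavior the abstract describes, dropping implausible actions via commonsense knowledge, can be sketched with a toy affordance lookup. The set-membership check below is a hypothetical stand-in for the agents' dynamic ConceptNet retrieval.

```python
def prune_implausible(actions, affordances):
    """Keep only actions whose (verb, object) pair appears in a
    commonsense affordance set -- a toy stand-in for ConceptNet lookups.
    Actions are strings of the form 'verb object'."""
    kept = []
    for action in actions:
        verb, _, obj = action.partition(" ")
        if (verb, obj) in affordances:
            kept.append(action)
    return kept
```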
On the Convergence and Sample Complexity Analysis of Deep Q-Networks with ε-Greedy Exploration
This paper provides a theoretical understanding of Deep Q-Network (DQN) with
ε-greedy exploration in deep reinforcement learning. Despite
the tremendous empirical achievement of the DQN, its theoretical
characterization remains underexplored. First, the exploration strategy is
either impractical or ignored in the existing analysis. Second, in contrast to
conventional Q-learning algorithms, the DQN employs the target network and
experience replay to acquire an unbiased estimation of the mean-square Bellman
error (MSBE) utilized in training the Q-network. However, the existing
theoretical analysis of DQNs lacks convergence analysis or bypasses the
technical challenges by deploying a significantly overparameterized neural
network, which is not computationally efficient. This paper provides the first
theoretical convergence and sample complexity analysis of the practical setting
of DQNs with an ε-greedy policy. We prove that an iterative procedure with
decaying ε converges to the optimal Q-value function geometrically. Moreover, a
higher level of ε values enlarges the region of convergence but slows down the
convergence, while the opposite holds for a lower level of ε values. Experiments
justify our established theoretical insights on DQNs.
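The ε-greedy policy with decaying ε that the analysis covers is standard and easy to state concretely. The geometric decay schedule below is one common choice, shown for illustration; the paper's specific schedule and constants may differ.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon take a uniformly random action,
    otherwise the greedy (argmax-Q) action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)

def decaying_epsilon(t, start=1.0, end=0.05, decay=0.99):
    """A common schedule: epsilon shrinks geometrically from `start`
    toward a floor `end` as training step t grows."""
    return end + (start - end) * decay ** t
```

A larger ε explores more broadly (the enlarged region of convergence in the paper's result) at the cost of slower convergence, which is exactly the trade-off the schedule manages over time.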
MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types
With the growing interest in large language models, the need for evaluating
the quality of machine text compared to reference (typically human-generated)
text has become a central concern. Most recent works focus either on
task-specific evaluation metrics or study the properties of machine-generated
text captured by the existing metrics. In this work, we propose a new
evaluation scheme to model human judgments in 7 NLP tasks, based on the
fine-grained mismatches between a pair of texts. Inspired by the recent efforts
in several NLP tasks for fine-grained evaluation, we introduce a set of 13
mismatch error types, such as spatial/geographic errors and entity errors, to
guide the model for better prediction of human judgments. We propose a neural
framework for evaluating machine texts that uses these mismatch error types as
auxiliary tasks and re-purposes the existing single-number evaluation metrics
as additional scalar features, in addition to textual features extracted from
the machine and reference texts. Our experiments reveal key insights about the
existing metrics via the mismatch errors. We show that the mismatch errors
between the sentence pairs on the held-out datasets from 7 NLP tasks align well
with the human evaluation. Comment: Accepted at ACL 2023 (ACL Findings, long paper).
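The input assembly the abstract describes, textual features from the text pair plus existing single-number metrics re-purposed as scalar features, can be sketched simply. The Jaccard token overlap below is a crude stand-in for the framework's learned textual features; the actual feature extractors are neural.

```python
def evaluation_features(machine_text, reference_text, metric_scores):
    """Build a feature vector for a (machine, reference) text pair:
    a textual-similarity feature (here, Jaccard token overlap as a
    stand-in) followed by existing metric scores as scalar features."""
    m = set(machine_text.lower().split())
    r = set(reference_text.lower().split())
    jaccard = len(m & r) / max(len(m | r), 1)
    return [jaccard] + list(metric_scores)
```

In the paper's framework these features feed a model trained jointly with the 13 mismatch-error-type predictions as auxiliary tasks.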